Genome Sequences of 228 Shiga Toxin-Producing Escherichia coli
Isolates and 12 Isolates Representing Other Diarrheagenic E. coli
Pathotypes

Shiga toxin-producing Escherichia coli (STEC) are a common cause of outbreaks and sporadic cases of food-borne diarrheal
illness. Here, we report the availability of the draft genome sequences of 228 STEC strains representing 32 serotypes with known
pulsed-field gel electrophoresis (PFGE) types and epidemiological relationships, as well as 12 strains representing other
diarrheagenic E. coli pathotypes.

The rapidly decreasing cost of next-generation sequencing (NGS)
will facilitate its application for real-time surveillance in the near
future. PulseNet, the molecular subtyping network for food-borne
disease surveillance, currently relies on pulsed-field gel
electrophoresis (PFGE) to define clusters of illness (1). In order to use NGS as a
primary method for cluster detection, a thorough understanding of
the genetic diversity in the target population is needed. Shiga
toxin-producing Escherichia coli (STEC) are among the pathogens tracked
by PulseNet. In this report, we announce the availability of the draft
sequences of a carefully selected set of STEC strains that should enable
us to gain insights into the sequence diversity within an outbreak or a
carrier state, among epidemiologically unrelated isolates within a
serotype, and between serotypes.

We sequenced 228 STEC strains representing 32 serotypes with
known PFGE types and epidemiological relationships. The strain
set included a total of 50 isolates from five outbreaks, 11 isolates
from a long-term carrier, and epidemiologically unrelated strains.
Twelve strains of other diarrheagenic E. coli pathotypes were
included as outliers. Genomic DNA from each strain was isolated
using the ArchivePure DNA cell/tissue kit (5Prime, Hamburg,
Germany). All 240 strains were sequenced to a minimum depth of
100× with the HiSeq 2000 or GAIIx (Illumina, San Diego, CA,
USA) using the TruSeq DNA LT sample prep kit (Illumina) for
DNA library preparation and 100-bp paired-end read chemistry.
Additionally, 82 strains were sequenced with the PacBio RS system
(Pacific Biosciences, Menlo Park, CA) using C2 chemistry and
four single-molecule real-time (SMRT) cells per genome.
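
As a rough check on what a 100× minimum depth implies, expected depth can be approximated as (number of reads × read length) / genome size. The sketch below applies that relation to the average genome size reported in this announcement. The function names are illustrative and not part of the authors' pipeline, and the estimate ignores trimming losses and uneven coverage.

```python
import math

def reads_for_depth(genome_size_bp, depth, read_length_bp):
    """Smallest read count with reads * read_length / genome_size >= depth."""
    return math.ceil(depth * genome_size_bp / read_length_bp)

def depth_from_reads(n_reads, read_length_bp, genome_size_bp):
    """Expected average depth for a given number of reads."""
    return n_reads * read_length_bp / genome_size_bp

# Average genome size reported for the sequenced strains (bp).
genome_size = 5_282_291

# 100-bp reads needed for 100x depth, and the equivalent number of read pairs.
reads_needed = reads_for_depth(genome_size, depth=100, read_length_bp=100)
pairs_needed = math.ceil(reads_needed / 2)

print(reads_needed, pairs_needed)
```

Because the read length and the target depth are both 100 here, the required read count equals the genome size in base pairs, which makes the relation easy to sanity-check by hand.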

Raw read quality checks were performed on the 240 samples
using FastQC (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc)
and in-house Perl scripts/Java programs. Primary analysis
for the Illumina data was performed using CLC Genomics
Workbench 5.5.1 (CLC bio, Aarhus, Denmark). The raw read files for each
sample were trimmed with length (minimum, 50 bp) and quality score
(0.02) filters. The trimmed reads were assembled into contigs with
specific parameter settings (length fraction, 0.8; similarity
fraction, 0.8; minimum contig length, 450 bp), and assembly statistics
were parsed out in a table format using in-house scripts. The
PacBio data analysis was performed using the whole-genome
sequencing (WGS) assembler toolkit (2). Error correction of the
filtered subreads was performed with the paired-end Illumina
data (~60× data were used) using the WGS toolkit PacBioToCA
script, followed by de novo assembly using the runCA script. The
best assembly for each of these 82 samples was chosen based on the
number of contigs, N50 value, and genome length.
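
The trimming thresholds quoted in this paragraph (quality limit, 0.02; minimum length, 50 bp) can be illustrated with a modified-Mott style trimmer, in which each base adds the limit minus its error probability to a running sum and the read is cut to the highest-scoring stretch. This is a minimal sketch under the assumption that 0.02 is an error-probability limit applied to Phred scores. It is not the CLC Genomics Workbench implementation, and the function names are hypothetical.

```python
def mott_trim(quals, limit=0.02):
    """Return (start, end) of the retained region, end exclusive.

    quals are Phred scores; each base contributes limit - 10**(-Q/10)
    to a running sum that is reset whenever it drops to zero or below.
    """
    running = 0.0
    best = 0.0
    start = 0              # first base after the most recent reset
    best_span = (0, 0)     # empty span if no base is good enough
    for i, q in enumerate(quals):
        running += limit - 10 ** (-q / 10)
        if running <= 0:
            running = 0.0
            start = i + 1
        elif running > best:
            best = running
            best_span = (start, i + 1)
    return best_span

def trim_read(seq, quals, limit=0.02, min_length=50):
    """Trim one read; return None if the retained part is too short."""
    start, end = mott_trim(quals, limit)
    if end - start < min_length:
        return None
    return seq[start:end]

# Toy read: 5 poor bases, 52 good bases, 3 poor bases.
seq = "A" * 5 + "C" * 52 + "G" * 3
quals = [2] * 5 + [30] * 52 + [2] * 3
trimmed = trim_read(seq, quals)
```

In this toy read the low-quality ends are removed and the 52 high-quality bases are kept, since 52 bp is above the 50-bp minimum.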

The average genome size for the sequenced strains was
5,282,291 bp (range, 4,527,885 to 5,712,627 bp). For the 240 Illumina
assemblies, the average number of contigs was 211 (range, 68 to
465), and the average N50 was 128,850 bp (range, 26,435 to 230,877 bp).
For the 82 PacBio hybrid assemblies, the average number of
contigs was 207 (range, 31 to 207), and the average N50 was 172,854 bp
(range, 31,094 to 1,414,730 bp).
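
For readers reproducing these statistics, N50 is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch follows. The helper names and the toy contig lengths are illustrative and are not taken from the authors' in-house scripts; the 450-bp cutoff mirrors the minimum contig length stated in the assembly settings.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length

def assembly_summary(contig_lengths, min_contig_length=450):
    """Contig count, N50, and total length after dropping short contigs."""
    kept = [n for n in contig_lengths if n >= min_contig_length]
    return {"contigs": len(kept), "n50": n50(kept), "total_bp": sum(kept)}

# Toy assembly: six contigs above the cutoff and one below it.
toy_contigs = [80_000, 70_000, 50_000, 40_000, 30_000, 20_000, 300]
summary = assembly_summary(toy_contigs)
print(summary)
```

Averaging the `contigs` and `n50` fields over all assemblies would give figures of the kind reported in this paragraph.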

Nucleotide sequence accession numbers. The draft genome 
sequences for these 240 diarrheagenic E. coli strains have been 
deposited in DDBJ/ENA/GenBank under the accession numbers 
listed in Table 1. 
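
When working with the accession list programmatically, a simple format check can catch transcription slips such as a letter O typed in place of a zero. The pattern below assumes the master-record format used in Table 1 (four uppercase letters followed by eight digits). Newer WGS projects may use longer prefixes, and the function name is hypothetical.

```python
import re

# Four-letter project prefix followed by eight digits.
WGS_MASTER = re.compile(r"[A-Z]{4}[0-9]{8}")

def is_wgs_master(accession):
    """True if the string matches the assumed four-letter WGS master format."""
    return WGS_MASTER.fullmatch(accession.strip()) is not None

candidates = ["JFBE00000000", "JHOA00000000", "JHOAOOOOOOOO", "JHOA0000"]
checks = {acc: is_wgs_master(acc) for acc in candidates}
print(checks)
```

The check validates only the shape of the string; it does not confirm that a record exists in DDBJ/ENA/GenBank.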

TABLE 1 NCBI accession numbers for 240 E. coli draft genomes

Strain ID     Serotype  NCBI accession no.
00-3279       O78:H12   JFBE00000000
01-3076       O111:NM   JFGU00000000
01-3147       O45:H2    JHOA00000000
02-3012       O81:NM    JHNZ00000000
02-3404       O28ac:NM  JHNY00000000
03-3227       O121:H19  JHNX00000000
03-3269       O174:H21  JHNW00000000
03-3458       O119:H4   JHNV00000000
03-3484       O111:NM   JHNU00000000
03-3500       O26:H11   JHNT00000000
04-3023       O103:H11  JHOD00000000
04-3038       O174:H8   JHOC00000000
04-3211       O111:NM   JHNS00000000
05-3646       O26:H11   JHOE00000000
06-3003       O121:H19  JHNR00000000
06-3256       O118:H16  JHNQ00000000
06-3325       O69:H11   JHNP00000000
06-3464       O26:H11   JHNO00000000
06-3484       O145:NM   JHNN00000000
06-3501       O79:H7    JHNM00000000
06-3555       O55:H7    JHNL00000000
06-3612       O118:H16  JHNK00000000
06-3691       O91:H14   JHNJ00000000
06-3745       O157:H7   JHNI00000000
06-3822       O121:H19  JHNH00000000
06-4039       O157:H7   JHNG00000000
07-3091       O157:H7   JHNF00000000
07-3391       O157:H7   JHNE00000000
07-4224       O113:H21  JHOB00000000
07-4281       O69:H11   JHLA00000000
08-3037       O157:H7   JHKZ00000000
08-3527       O157:H7   JHKY00000000
08-3651       O118:H16  JHKX00000000
08-4169       O157:H7   JHKW00000000
08-4270       O145:NM   JHKV00000000
08-4487       O111:NM   JHKU00000000
08-4529       O157:H7   JHHI00000000
08-4540       O157:NM   JHHH00000000
08-4661       O69:H11   JHHG00000000
2009C-3227    O91:H14   JHHF00000000
2009C-3279    O103:H2   JHHE00000000
2009C-3292    O145:H28  JHHD00000000
2009C-3299    O121:H7   JHHC00000000
2009C-3307    O123:H11  JHHB00000000
2009C-3601    O69:H11   JHHA00000000
2009C-3612    O26:H11   JHGZ00000000
2009C-3686    O45:H2    JHGY00000000
2009C-3689    O26:H11   JHGX00000000
2009C-3745    O91:NM    JHGW00000000
2009C-3996    O26:H11   JHGV00000000
2009C-4006    O111:NM   JHGU00000000
2009C-4050    O121:H19  JHGT00000000
2009C-4052    O111:NM   JHGS00000000
2009C-4126    O111:H8   JHGR00000000
2009C-4258    O157:H7   JHGQ00000000
2009C-4446    O118:H16  JHGP00000000
2009C-4646    O91:H21   JHGO00000000
2009C-4659    O121:H19  JHGN00000000
2009C-4747    O26:H11   JHGM00000000
2009C-4750    O121:H19  JHGL00000000
2009C-4760    O26:H11   JHGK00000000
2009C-4780    O45:H2    JHGJ00000000
2009C-4826    O26:H11   JHGI00000000
2009EL1302    O121:H19  JHGH00000000
2009EL1412    O121:H19  JHGG00000000
2009EL1449    O157:H7   JHGF00000000
2009EL1705    O157:H7   JHGE00000000
2009EL1913    O157:H7   JHGD00000000
2009EL2109    O157:H7   JHGC00000000
2009EL-2169   O111:H8   JHGB00000000
2010C-3051    O26:H11   JHGA00000000
2010C-3053    O111:NM   JHFZ00000000
2010C-3214    O103:H11  JHFY00000000
2010C-3472    O26:H11   JHFX00000000
2010C-3507    O145:NM   JHFW00000000
2010C-3508    O145:NM   JHFV00000000
2010C-3509    O145:NM   JHFU00000000
2010C-3510    O145:NM   JHFT00000000
2010C-3511    O145:NM   JHFS00000000
2010C-3516    O145:NM   JHFR00000000
2010C-3517    O145:NM   JHFQ00000000
2010C-3518    O145:NM   JHFP00000000
2010C-3521    O145:NM   JHFO00000000
2010C-3526    O145:NM   JHFN00000000
2010C-3609    O121:H19  JHFM00000000
2010C-3794    O121:H19  JHFL00000000
2010C-3840    O121:H19  JHFK00000000
2010C-3871    O26:H11   JHFJ00000000
2010C-3876    O45:H2    JHFI00000000
2010C-3902    O26:H11   JHFH00000000
2010C-3977    O111:NM   JHFG00000000
2010C-4086    O111:NM   JHFF00000000
2010C-4221    O111:NM   JHFE00000000
2010C-4244    O26:H11   JHFD00000000
2010C-4254    O121:H19  JHFC00000000
2010C-4347    O26:NM    JHFB00000000
2010C-4430    O26:H11   JHND00000000
2010C-4433    O103:H2   JHNC00000000
2010C-4529    O103:H25  JHNB00000000
2010C-4557C2  O145:NM   JHNA00000000
2010C-4558    O177:NM   JHMZ00000000
2010C-4592    O111:NM   JHMY00000000
2010C-4622    O111:NM   JHMX00000000
2010C-4715    O111:NM   JHMW00000000
2010C-4732    O121:H19  JHMV00000000
2010C-4735    O111:NM   JHMU00000000
2010C-4746    O111:NM   JHMT00000000
2010C-4788    O26:NM    JHMS00000000
2010C-4799    O111:NM   JHMR00000000
2010C-4818    O111:NM   JHMQ00000000
2010C-4819    O26:H11   JHMP00000000
2010C-4824    O121:H19  JHMO00000000
2010C-4834    O26:H11   JHMN00000000
2010C-4874    O165:H25  JHMM00000000
2010C-4966    O121:H19  JHML00000000
2010C-4979C1  O157:H7   JHMK00000000
2010C-4989    O121:H19  JHMJ00000000
2010C-5028    O26:H11   JHMI00000000
2010C-5034    O153:H2   JHMH00000000
2010EL1058    O121:H19  JHMG00000000
2010EL-1699   O26:H11   JHMF00000000
2010EL-2044   O157:H7   JHME00000000
2010EL-2045   O157:H7   JHMD00000000
2011C-3072    O121:H19  JHMC00000000
2011C-3108    O121:H19  JHMB00000000
2011C-3170    O111:NM   JHMA00000000
2011C-3216    O121:H19  JHLZ00000000
2011C-3270    O26:H11   JHLY00000000
2011C-3282    O26:H11   JHLX00000000
2011C-3362    O111:NM   JHLW00000000

TABLE 1 (Continued)

Strain ID     Serotype  NCBI accession no.
2011C-3387    O26:H11   JHLV00000000
2011C-3453    O111:H8   JHLU00000000
2011C-3500    O121:H19  JHLT00000000
2011C-3506    O26:H11   JHLS00000000
2011C-3537    O121:H19  JHLR00000000
2011C-3573    O111:NM   JHLQ00000000
2011C-3602    O156:H25  JHLP00000000
2011C-3632    O111:NM   JHLO00000000
2011C-3655    O26:H11   JHLN00000000
2011C-3679    O111:NM   JHLM00000000
2011C-3750    O103:H2   JHLL00000000
2011EL-1107   O157:H7   JHLK00000000
2011EL-1675A  O104:H4   JHLJ00000000
2011EL-2090   O157:H7   JHLI00000000
2011EL-2091   O157:H7   JHLH00000000
2011EL-2092   O157:H7   JHLG00000000
2011EL-2093   O157:H7   JHLF00000000
2011EL-2094   O157:H7   JHLE00000000
2011EL-2096   O157:H7   JHLD00000000
2011EL-2097   O157:H7   JHLC00000000
2011EL-2098   O157:H7   JHLB00000000
2011EL-2099   O157:H7   JHKT00000000
2011EL-2101   O157:H7   JHKS00000000
2011EL-2103   O157:H7   JHKR00000000
2011EL-2104   O157:H7   JHKQ00000000
2011EL-2105   O157:H7   JHKP00000000
2011EL-2106   O157:H7   JHKO00000000
2011EL-2107   O157:H7   JHKN00000000
2011EL-2108   O157:H7   JHKM00000000
2011EL-2109   O157:H7   JHKL00000000
2011EL-2111   O157:H7   JHKK00000000
2011EL-2112   O157:H7   JHKJ00000000
2011EL-2113   O157:H7   JHKI00000000
2011EL-2114   O157:H7   JHKH00000000
2011EL-2286   O157:H7   JHKG00000000
2011EL-2287   O157:H7   JHKF00000000
2011EL-2288   O157:H7   JHKE00000000
2011EL-2289   O157:H7   JHKD00000000
2011EL-2290   O157:H7   JHKC00000000
2011EL-2312   O157:H7   JHKB00000000
2011EL-2313   O157:H7   JHKA00000000
94-3025       O104:H21  JHJZ00000000
98-3133       O157:H16  JHJY00000000
99-3124       O86:H34   JHJX00000000
99-3165       O6:H16    JHJW00000000
E2539C1       O25:NM    JHJV00000000
F5656C1       O6:H16    JHJU00000000
F6142         O157:H7   JHJT00000000
F6627         O111:H8   JHJS00000000
F6714         O121:H19  JHJR00000000
F6749         O157:H7   JHJQ00000000
F6750         O157:H7   JHJP00000000
F6751         O157:H7   JHJO00000000
F7350         O157:H7   JHJN00000000
F7377         O157:H7   JHJM00000000
F7384         O157:H7   JHJL00000000
F7410         O157:H7   JHJK00000000
F9792         O169:H41  JHJJ00000000
G5303         O157:H7   JHJI00000000
H2495         O157:H7   JHJH00000000
H2498         O157:H7   JHJG00000000
K1420         O157:H7   JHJF00000000
K1516         O15:H18   JHJE00000000
K1792         O157:H7   JHJD00000000
K1793         O157:H7   JHJC00000000
K1795         O157:H7   JHJB00000000
K1796         O157:H7   JHJA00000000
K1845         O157:H7   JHIZ00000000
K1921         O157:H7   JHIY00000000
K1927         O157:H7   JHIX00000000
K2188         O157:H7   JHIW00000000
K2191         O157:H7   JHIV00000000
K2192         O157:H7   JHIU00000000
K2324         O157:H7   JHIT00000000
K2581         O157:H7   JHIS00000000
K2622         O157:H7   JHIR00000000
K2845         O157:H7   JHIQ00000000
K2854         O157:H7   JHIP00000000
K4396         O157:H7   JHIO00000000
K4405         O157:H7   JHIN00000000
K4406         O157:H7   JHIM00000000
K4527         O157:H7   JHIL00000000
K5198         O121:H19  JHIK00000000
K5269         O121:H19  JHIJ00000000
K5418         O157:H7   JHII00000000
K5448         O157:H7   JHIH00000000
K5449         O157:H7   JHIG00000000
K5453         O157:H7   JHIF00000000
K5460         O157:H7   JHIE00000000
K5467         O157:H7   JHID00000000
K5602         O157:H7   JHIC00000000
K5607         O157:H7   JHIB00000000
K5609         O157:H7   JHIA00000000
K5806         O157:H7   JHHZ00000000
K5852         O157:H7   JHHY00000000
K6590         O157:H7   JHHX00000000
K6676         O157:H7   JHHW00000000
K6687         O157:H7   JHHV00000000
K6722         O111:NM   JHHU00000000
K6723         O111:NM   JHHT00000000
K6728         O111:NM   JHHS00000000
K6890         O111:NM   JHHR00000000
K6895         O111:NM   JHHQ00000000
K6897         O111:NM   JHHP00000000
K6898         O111:NM   JHHO00000000
K6904         O111:NM   JHHN00000000
K6908         O111:NM   JHHM00000000
K6915         O111:NM   JHHL00000000
K7140         O157:H7   JHHK00000000
F8704-2       O39:NM    JHHJ00000000